Caching Patterns in System Design
Core Concepts
Cache Hit vs Cache Miss
| Term | Definition | Speed | What Happens |
|---|---|---|---|
| Cache Hit ✅ | Requested data found in cache | Fast | Data retrieved directly from cache |
| Cache Miss ❌ | Requested data not in cache | Slow | Must fetch from main storage, then optionally cache it |
Analogy
Think of cache as your desk drawer and main memory as a library shelf:
- Cache hit: Find your notebook in the desk drawer → instant access ⚡
- Cache miss: Not in drawer → walk to the library shelf → slower retrieval
Caching Patterns
1. Cache-Aside (Lazy Loading)
```
┌─────────────┐
│ Application │
└──────┬──────┘
       │
       ▼
  Check cache?
  ├─ Hit  → Return
  └─ Miss → Fetch DB → Store in Cache → Return
```
Characteristics:
- Application manages cache explicitly
- Cache populated on-demand
- Most common pattern
Pros:
- ✅ Only caches what's actually needed
- ✅ Cache failure doesn't break the system
- ✅ Flexible - app has full control
Cons:
- ❌ First request always slow (cache miss)
- ❌ Extra code in application layer
- ❌ Potential for stale data
When to Use:
- Read-heavy workloads
- When you want fine-grained control
- General-purpose caching (Redis, Memcached)
Real-World Example: E-commerce product catalog caching
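As a minimal sketch of the pattern, plain dicts stand in for a cache such as Redis and for the database; `get_product` and `update_product` are hypothetical names:

```python
# Cache-aside sketch: the application manages the cache explicitly.
cache = {}
db = {"p1": {"name": "Widget", "price": 9.99}}

def get_product(product_id):
    """Read path: check the cache first; on a miss, fetch from the DB
    and populate the cache for subsequent reads."""
    product = cache.get(product_id)
    if product is not None:
        return product                  # cache hit: fast path
    product = db.get(product_id)        # cache miss: go to storage
    if product is not None:
        cache[product_id] = product     # lazy-load into the cache
    return product

def update_product(product_id, data):
    """Write path: update the DB, then invalidate the cached copy so the
    next read reloads fresh data instead of serving a stale entry."""
    db[product_id] = data
    cache.pop(product_id, None)
```

Invalidating on write (rather than updating the cache in place) is one common way to limit the stale-data con listed above.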
2. Read-Through
```
┌─────────────┐          ┌───────┐
│ Application │ ──Read──>│ Cache │ ──Auto Fetch──> Database
└─────────────┘          └───────┘
```
Characteristics:
- Cache layer handles data loading
- Transparent to application
- Cache acts as abstraction over DB
Pros:
- ✅ Simpler application code
- ✅ Centralized cache logic
- ✅ Consistent read interface
Cons:
- ❌ First request still slow
- ❌ Adds complexity to cache layer
- ❌ Tighter coupling with cache system
When to Use:
- When you want cache to own data loading
- Frameworks that support it (Ehcache, Caffeine)
- Microservices with dedicated cache service
Difference from Cache-Aside: the cache layer performs the DB fetch on a miss, whereas in cache-aside the application does it.
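A read-through wrapper can be sketched as a class that owns the loader function; the `ReadThroughCache` name and dict-backed store are illustrative, not a real library API:

```python
class ReadThroughCache:
    """Read-through sketch: the cache owns the loader, so callers never
    touch the database directly for reads."""

    def __init__(self, loader):
        self._loader = loader            # invoked automatically on a miss
        self._store = {}

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._loader(key)   # transparent auto-fetch
        return self._store[key]

# Example wiring: the "database" is just a dict lookup here.
database = {"user:1": "Ada"}
users = ReadThroughCache(loader=database.get)
```

Note the stale-data trade-off: once loaded, an entry is served from the cache until some invalidation (TTL or explicit) removes it.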
3. Write-Through
```
┌─────────────┐           ┌──> Cache    (sync)
│ Application │ ──Write───┤
└─────────────┘           └──> Database (sync)
```
Characteristics:
- Every write goes to both cache and DB
- Synchronous double-write
- Strong consistency guaranteed
Pros:
- ✅ Cache always fresh and consistent
- ✅ No risk of stale reads
- ✅ Simple consistency model
Cons:
- ❌ Higher write latency (two operations)
- ❌ Wasted writes for rarely-read data
- ❌ Cache can fill with cold data
When to Use:
- Strong consistency requirements
- Read-after-write scenarios common
- Financial systems, user profiles
Real-World Example: User session data, account balances
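A write-through sketch under the same dict-as-database assumption (`WriteThroughCache` is an illustrative name):

```python
class WriteThroughCache:
    """Write-through sketch: both stores are updated in the same
    synchronous operation, so cached reads are never stale (at the
    cost of extra write latency)."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def write(self, key, value):
        self.db[key] = value       # synchronous DB write (the slow part)
        self.cache[key] = value    # cache updated in the same operation

    def read(self, key):
        # Serve from cache; fall back to the DB for keys written elsewhere.
        return self.cache.get(key, self.db.get(key))
```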
4. Write-Behind / Write-Back
```
┌─────────────┐
│ Application │ ──Write──> Cache (fast return) ~~async batch~~> Database
└─────────────┘
```
Characteristics:
- Writes happen to cache only
- DB updated later in batches
- Eventually consistent
Pros:
- ✅ Extremely fast writes
- ✅ Can batch/coalesce multiple writes
- ✅ Reduces DB load significantly
Cons:
- ❌ Risk of data loss if cache fails
- ❌ Complexity in failure handling
- ❌ Eventual consistency only
When to Use:
- High write throughput needed
- Acceptable to lose recent writes on failure
- Analytics pipelines, logging systems
Real-World Example: Page view counters, metrics aggregation
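The batching and coalescing behaviour can be sketched as below; this version flushes synchronously for clarity, whereas a real implementation would flush asynchronously on a timer and must cope with losing buffered writes if the cache node dies. All names are illustrative.

```python
class WriteBehindCache:
    """Write-behind sketch: writes land in the cache plus a dirty-key
    set; the database is updated later in one batched flush."""

    def __init__(self, db, batch_size=3):
        self.db = db
        self.cache = {}
        self._dirty = set()        # a set coalesces repeated writes to a key
        self._batch_size = batch_size

    def write(self, key, value):
        self.cache[key] = value    # fast path: no DB round-trip
        self._dirty.add(key)
        if len(self._dirty) >= self._batch_size:
            self.flush()

    def flush(self):
        for key in self._dirty:    # one batched DB update
            self.db[key] = self.cache[key]
        self._dirty.clear()

# Example wiring for a page-view counter.
counter_db = {}
counters = WriteBehindCache(counter_db, batch_size=2)
```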
5. Write-Around
```
Write: Application ──────────────> Database (bypass cache)
Read:  Application ──> Cache? ──Miss──> Database ──> Cache
```
Characteristics:
- Writes skip the cache entirely
- Cache populated only on reads
- Prevents cache pollution
Pros:
- ✅ Avoids caching rarely-read data
- ✅ Keeps cache focused on hot data
- ✅ Better cache hit ratio for actual reads
Cons:
- ❌ First read after write always misses
- ❌ Higher latency for read-after-write
- ❌ Not ideal for write-then-read patterns
When to Use:
- High write volume, low read volume
- Write-once-read-never scenarios
- Log ingestion, data warehousing
Real-World Example: Event logging, audit trails
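A write-around sketch, again with plain dicts standing in for the cache and the event store (`log_event`/`read_event` are hypothetical names):

```python
cache = {}
event_db = {}

def log_event(event_id, event):
    """Write path bypasses the cache entirely, so write-once-read-never
    data never pollutes it."""
    event_db[event_id] = event

def read_event(event_id):
    """Read path is plain cache-aside: the cache fills only here."""
    if event_id in cache:
        return cache[event_id]
    event = event_db.get(event_id)
    if event is not None:
        cache[event_id] = event
    return event
```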
6. Refresh-Ahead (Proactive Caching)
```
Cache monitors TTL ──> Preemptively refreshes BEFORE expiration
```
Characteristics:
- Predictive cache warming
- Reduces cache misses for hot data
- Requires usage pattern prediction
Pros:
- ✅ Minimizes cache misses
- ✅ Consistent low latency
- ✅ Great for predictable access patterns
Cons:
- ❌ Wastes resources on cold data
- ❌ Complex implementation
- ❌ Needs good prediction algorithm
When to Use:
- Frequently accessed data with predictable patterns
- Low-latency requirements (gaming, trading)
- Content delivery networks (CDN)
Real-World Example: Homepage content, trending articles
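A simplified refresh-ahead sketch: once an entry passes a refresh threshold (a fraction of its TTL), the next read reloads it proactively while still serving the current value. The `clock` parameter is injectable purely so the behaviour is testable; a production version would refresh asynchronously. All names here are illustrative.

```python
import time

class RefreshAheadCache:
    def __init__(self, loader, ttl=60.0, refresh_ratio=0.8, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl
        self._refresh_after = ttl * refresh_ratio
        self._clock = clock
        self._store = {}                       # key -> (value, loaded_at)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is None or now - entry[1] >= self._ttl:
            value = self._loader(key)          # miss or expired: blocking load
            self._store[key] = (value, now)
            return value
        value, loaded_at = entry
        if now - loaded_at >= self._refresh_after:
            # Still valid but near expiry: refresh before anyone misses.
            self._store[key] = (self._loader(key), now)
        return value
```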
7. TTL (Time-to-Live) Based
```
Cache Entry [Created] ──(time passes)──> [TTL Expires] ──> Auto-removed
```
Characteristics:
- Time-based expiration
- Simplest invalidation strategy
- Combined with other patterns
Pros:
- ✅ Simple to implement
- ✅ Prevents indefinitely stale data
- ✅ Works with any caching pattern
Cons:
- ❌ Can cause cache miss storms at expiration
- ❌ Arbitrary time selection
- ❌ May evict still-valid data
When to Use:
- Data with known freshness requirements
- Combined with most caching strategies
- Session tokens, temporary data
Real-World Example: API rate limiting, JWT tokens
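A minimal TTL cache with lazy (read-time) expiration; the injectable `clock` is again only for testability. Real caches such as Redis expire keys server-side instead (via `EXPIRE`/`SET ... EX`).

```python
import time

class TTLCache:
    """TTL sketch: entries carry an expiry timestamp and are checked
    (and dropped) when read."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}                  # key -> (value, expires_at)

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]          # expired: auto-removed on access
            return default
        return value
```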
Eviction Policies
LRU (Least Recently Used)
- Strategy: Evicts items not accessed recently
- Best for: Temporal locality (recently used = likely to be used again)
- Example: Web page caching
LFU (Least Frequently Used)
- Strategy: Evicts items accessed least often
- Best for: Popular content, frequency-based access
- Example: Video streaming platforms
FIFO (First In First Out)
- Strategy: Evicts oldest entries
- Best for: Simple queue-like behavior
- Example: Basic message queues
Random Replacement
- Strategy: Evicts random entries
- Best for: When no clear pattern exists, lowest overhead
- Example: Simple distributed caches
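LRU, the most common of these policies, falls out naturally from an ordered map. A sketch on top of Python's `OrderedDict` (capacity of 2 chosen just for illustration):

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction: reads and writes move a key to the 'most recently
    used' end; the other end is evicted when the cache is over capacity."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._store = OrderedDict()

    def get(self, key, default=None):
        if key not in self._store:
            return default
        self._store.move_to_end(key)          # mark as most recently used
        return self._store[key]

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self._capacity:
            self._store.popitem(last=False)   # evict the least recently used
```

For memoizing pure functions, the standard library's `functools.lru_cache` already provides this policy out of the box.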
Pattern Comparison Matrix
| Pattern | Write Speed | Read Speed | Consistency | Complexity | Data Loss Risk |
|---|---|---|---|---|---|
| Cache-Aside | 🟡 Medium | 🟢 Fast* | 🟡 Eventual | 🟢 Low | 🟢 Low |
| Read-Through | 🟡 Medium | 🟢 Fast* | 🟡 Eventual | 🟡 Medium | 🟢 Low |
| Write-Through | 🔴 Slow | 🟢 Very Fast | 🟢 Strong | 🟢 Low | 🟢 None |
| Write-Back | 🟢 Very Fast | 🟢 Very Fast | 🟡 Eventual | 🔴 High | 🔴 High |
| Write-Around | 🟢 Fast | 🟡 Medium | 🟡 Eventual | 🟢 Low | 🟢 None |
| Refresh-Ahead | 🟡 Medium | 🟢 Very Fast | 🟡 Eventual | 🔴 High | 🟢 Low |
*After initial cache miss
Common Pattern Combinations
High-Traffic Web Application
Read Strategy: Cache-Aside + LRU eviction
Write Strategy: Write-Through for critical data
TTL: 5-15 minutes for most content
Tools: Redis, Memcached
Analytics Pipeline
Read Strategy: Read-Through
Write Strategy: Write-Back (batch inserts)
Eviction: LFU (frequently queried reports)
Tools: Apache Ignite, Hazelcast
E-commerce Product Catalog
Read Strategy: Cache-Aside + Refresh-Ahead for bestsellers
Write Strategy: Write-Around for inventory updates
TTL: 1 hour for product details
Tools: Redis with pub/sub for invalidation
Social Media Feed
Read Strategy: Cache-Aside + Refresh-Ahead for active users
Write Strategy: Write-Back for likes/views
TTL: 30 seconds for feed items
Eviction: LRU
Tools: Redis Cluster
Decision Tree
```
├─ Need strong consistency?
│  ├─ YES → Write-Through
│  └─ NO ↓
│
├─ High write volume?
│  ├─ YES →
│  │  └─ Can tolerate data loss?
│  │     ├─ YES → Write-Back
│  │     └─ NO  → Write-Around
│  └─ NO → Cache-Aside
│
├─ Need ultra-low read latency?
│  └─ Add Refresh-Ahead
│
└─ Cache filling up?
   └─ Choose eviction:
      ├─ Temporal patterns  → LRU
      └─ Popularity-based   → LFU
```
Best Practices
1. Start with Cache-Aside
   - Most flexible and widely understood
   - Easy to debug and reason about
2. Always Set TTL
   - Even with other invalidation strategies
   - Prevents unbounded cache growth
3. Monitor Cache Hit Ratio
   - Aim for >80% for effectiveness
   - Alert on sudden drops
4. Handle Cache Failures Gracefully
   - App should work even if cache is down
   - Implement circuit breakers
5. Use Appropriate Serialization
   - Consider Protobuf/MessagePack over JSON
   - Faster and more compact
6. Warm Critical Caches on Startup
   - Don't wait for cold starts
   - Pre-populate frequently accessed data
7. Implement Cache Stampede Protection
   - Use locks/semaphores for cache misses
   - Prevent thundering herd
8. Size Your Cache Appropriately
   - Monitor eviction rates
   - Balance memory cost vs hit rate
Performance Tips
- Batch operations when possible (especially with Write-Back)
- Use pipeline/multi-get for multiple keys (Redis MGET, MSET)
- Consider cache-aside for writes even with read-through for reads
- Implement circuit breakers for cache failures
- Use connection pooling for cache clients
- Monitor P99 latencies, not just averages
- Compress large values before caching
- Use appropriate data structures (Redis Hashes, Sets, Sorted Sets)
Common Pitfalls
Cache Stampede
Problem: Multiple requests reload same expired data simultaneously
Solution:
- Locking mechanisms (distributed locks)
- Early recomputation (refresh before expiry)
- Probabilistic early expiration
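The first solution, a per-key lock, can be sketched as follows: on a miss, one thread recomputes the value while the others block on the same lock and then read the cached result instead of all hitting the database at once. `StampedeProtectedCache` is an illustrative name.

```python
import threading

class StampedeProtectedCache:
    def __init__(self, loader):
        self._loader = loader
        self._store = {}
        self._guard = threading.Lock()   # protects the per-key lock table
        self._key_locks = {}

    def get(self, key):
        value = self._store.get(key)
        if value is not None:
            return value                 # fast path: no locking on a hit
        with self._guard:
            lock = self._key_locks.setdefault(key, threading.Lock())
        with lock:
            # Double-check: another thread may have loaded it while we waited.
            value = self._store.get(key)
            if value is None:
                value = self._loader(key)
                self._store[key] = value
        return value
```

In a distributed cache the same idea uses a distributed lock (e.g. a Redis `SET key value NX PX ...` lock) rather than a thread lock.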
Stale Data
Problem: Cache inconsistent with database
Solution:
- Proper TTL settings
- Invalidation on writes
- Event-driven cache updates
Cache Pollution
Problem: Rarely-used data fills cache
Solution:
- Write-Around pattern
- Better eviction policies (LRU/LFU)
- Cache only frequently accessed data
Over-caching
Problem: Caching everything indiscriminately
Solution:
- Profile and measure what to cache
- Cache only expensive queries
- Monitor cache hit rates per key pattern
No Monitoring
Problem: Not knowing hit rates, evictions, or issues
Solution:
- Implement comprehensive metrics
- Dashboard for cache health
- Alerts for anomalies
Ignoring Cache Warm-up
Problem: Cold start causes poor initial performance
Solution:
- Pre-populate cache on deployment
- Gradual traffic ramping
- Keep cache instances alive during deployments
Key Metrics to Monitor
| Metric | What It Measures | Target |
|---|---|---|
| Hit Rate | % of requests served from cache | >80% |
| Miss Rate | % of requests requiring DB fetch | <20% |
| Eviction Rate | How often data is removed | Low & stable |
| Memory Usage | Cache memory consumption | <80% capacity |
| Latency (P50, P99) | Response time distribution | <10ms P99 |
| Throughput | Operations per second | Application dependent |
| Connection Pool | Active connections | Stable |
| Error Rate | Failed cache operations | <0.1% |
Popular Cache Technologies
In-Memory Caches
- Redis - Feature-rich, supports data structures, persistence
- Memcached - Simple, fast, lightweight
- Hazelcast - Distributed, Java-based, compute capabilities
Application-Level Caches
- Caffeine - High-performance Java cache library
- Ehcache - Java cache with disk persistence
- Guava Cache - Simple in-process cache for Java
CDN/Edge Caches
- CloudFlare - Global CDN with edge caching
- AWS CloudFront - Integrated with AWS services
- Fastly - Real-time CDN with VCL customization
Distributed Caches
- Apache Ignite - Distributed database and cache
- Aerospike - High-performance distributed cache
- Couchbase - Document DB with built-in caching
Further Reading
- Redis Documentation: https://redis.io/docs/
- Memcached Wiki: https://github.com/memcached/memcached/wiki
- AWS Caching Best Practices: https://aws.amazon.com/caching/
- Martin Fowler on Caching: https://martinfowler.com/
- Google SRE Book - Caching: https://sre.google/sre-book/
- Designing Data-Intensive Applications by Martin Kleppmann (Chapter 3)
Quick Reference Cheat Sheet
```
┌───────────────────────────────────────────────────┐
│             WHEN TO USE WHICH PATTERN             │
├───────────────────────────────────────────────────┤
│ Read-Heavy + Control      →  Cache-Aside          │
│ Strong Consistency        →  Write-Through        │
│ High Write Throughput     →  Write-Back           │
│ Rarely Read After Write   →  Write-Around         │
│ Predictable Hot Data      →  Refresh-Ahead        │
│ Time-Sensitive Data       →  TTL-Based            │
└───────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────┐
│             EVICTION POLICY SELECTION             │
├───────────────────────────────────────────────────┤
│ Recent = Relevant         →  LRU                  │
│ Frequency Matters         →  LFU                  │
│ Simple Queue              →  FIFO                 │
│ No Pattern / Testing      →  Random               │
└───────────────────────────────────────────────────┘
```